Abstract

This paper improves on previous rates at which lag lengths are allowed to grow for consistent
covariance matrix estimation with heterogeneous dependent data. Using a WLLN, we give a
consistency result for growth rates of $o(n^{1/3})$; the previous rate was $o(n^{1/4})$. This new rate equals that
of Berk's autoregressive spectral density estimator for well-behaved stationary contexts, and thus
may be best possible outside of very special cases.

1.  Introduction 

Estimating  consistent  covariance  matrices  is  one  of  the  most  common  problems  confronting  the  applied 
researcher.  The  need  to  do  this  arises  in  econometric  work  ranging  from  Euler  equation  estimation  by 
GMM  methods  (Hansen  and  Singleton  [1982])  to  tests  for  integration  and  cointegration  (Phillips  [1987], 
Phillips  and  Perron  [1988],  Phillips  and  Ouliaris  [1988],  and  Stock  [1988]).  Thus  the  results  of  Hansen 
[1982]  (for  stationary  data),  and  White  [1984],  White  and  Domowitz  [1984],  and  Newey  and  West  [1987]  (for 
heterogeneous  dependent  data)  on  consistent  covariance  matrix  estimation  have  been  very  widely  applied. 

Newey  and  West  [1987]  adapted  results  in  White  [1984]  and  White  and  Domowitz  [1984]  to  obtain  a  class 
of  non-negative  definite  consistent  covariance  matrix  estimators  for  dependent  non-iid  data.  Their  correction 
to arguments in White [1984] 6.19 led to an $o(n^{1/4})$ rate for increasing lag length to preserve consistency.
The  same  rate  appears  in  Gallant  and  White  [1988]  6.18,  and  has  been  used  quite  generally  (see  for  instance 
Phillips  [1987]  and  Phillips  and  Perron  [1988]). 

This paper improves that rate to $o(n^{1/3})$, the same as that for Berk's autoregressive (time domain)
spectral density estimator for strictly stationary data. Except for very special cases, this new rate of $o(n^{1/3})$
may be the best possible. Since the choice of lag length is often one of the most troubling (and seemingly
arbitrary) to applied researchers, this rate improvement should permit greater flexibility without risking the
loss of consistent inference.

Note that this $o(n^{1/3})$ rate is exactly that originally in the conclusion of White's [1984] Theorem 6.20.
As Newey and West [1987] have pointed out, however, this result did not follow from White's proof. This
paper  therefore  uses  a  different  argument  to  re-establish  that  result,  under  essentially  the  same  regularity 
assumptions.  The  proof  is  remarkably  straightforward. 

2.  Notation 

The $p$-norm of a random variable (rv) $X$ defined on a given probability space $(\Omega, \mathcal{F}, \Pr)$ is denoted
$\|X\|_p = E^{1/p}|X|^p$. The random variable that is the absolute value of $X$ is denoted $|X|$. Recall that
$\{X_t, t \ge 1\}$ is said to be $\alpha$-mixing of size $-q$ if the $\alpha$-mixing coefficients tend to zero and satisfy
$\alpha_m = O(m^{\lambda})$ for some $\lambda < -q$, so that $\sum_m \alpha_m^{1/q} < \infty$. Similarly $X$ is said to be
$\phi$-mixing of size $-q$ if its $\phi$-mixing coefficients satisfy analogous conditions. See for example Gallant and
White [1988]. Let $\phi_m^R$ denote the $\phi$-mixing coefficients when $X_t$ is reversed in time; let
$\phi_m^+ = \max(\phi_m, \phi_m^R)$. When $X$ is Gaussian and covariance stationary, $\phi_m^+ = \phi_m = \phi_m^R$ for all $m$.
It will be convenient below to place restrictions on $\phi^+$.
We  will  need  the  following: 

Lemma 2.1 (Davydov's Inequality): For all $p$, $q$ such that $1/p + 1/q < 1$,

$$|E X_t X_{t-j} - E X_t \, E X_{t-j}| \le 15\, \alpha_j^{1 - 1/p - 1/q}\, \|X_t\|_p \cdot \|X_{t-j}\|_q .$$


See  for  example  Philipp  [1986,  p. 241]  Lemma  3.1  for  this  form  of  Davydov's  result. 
Lemma 2.2 (Peligrad's Inequality): For all $p$, $q$ such that $1/p + 1/q = 1$,

$$|E X_t X_{t-j} - E X_t \, E X_{t-j}| \le 2\, (\phi_j)^{1/p} (\phi_j^R)^{1/q}\, \|X_t\|_p \, \|X_{t-j}\|_q .$$

∎

This was first obtained in Peligrad [1983], and improves by $(\phi_j^R)^{1/q}$ on the earlier long-standing inequality.
White [1984] 6.16 is a special case of 2.1; when $X$ is covariance stationary, 2.2 is a strict improvement on
White [1984] 6.16.
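
As a worked illustration (added here, not part of the original argument), the two lemmas can be specialized to symmetric exponents, which is the natural choice when both factors share a common moment bound. The substitutions are elementary arithmetic on the exponents; $r > 1$ is the moment index that appears in the assumptions of Section 3, and $\phi_j^+ = \max(\phi_j, \phi_j^R)$ as defined above.

```latex
\begin{align*}
\text{Lemma 2.1, } p = q = 2r:\quad
  & |E X_t X_{t-j} - E X_t\, E X_{t-j}| \le 15\,\alpha_j^{\,1 - 1/r}\,\|X_t\|_{2r}\,\|X_{t-j}\|_{2r}, \\
\text{Lemma 2.1, } p = q = 4r:\quad
  & |E X_t X_{t-j} - E X_t\, E X_{t-j}| \le 15\,\alpha_j^{\,1 - 1/(2r)}\,\|X_t\|_{4r}\,\|X_{t-j}\|_{4r}, \\
\text{Lemma 2.2, } p = q = 2:\quad
  & |E X_t X_{t-j} - E X_t\, E X_{t-j}| \le 2\,(\phi_j\,\phi_j^R)^{1/2}\,\|X_t\|_2\,\|X_{t-j}\|_2
    \le 2\,\phi_j^+\,\|X_t\|_2\,\|X_{t-j}\|_2 .
\end{align*}
```

The last step uses only $\phi_j \le \phi_j^+$ and $\phi_j^R \le \phi_j^+$, which is why it is convenient to state restrictions in terms of $\phi^+$.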
3. Results

The first result is a weak law of large numbers (WLLN) for a process that fails the usual weak dependence
assumptions for mixingales (and thus for mixing sequences as well). Further, the process will have growing
first absolute moments so that it is not an $L^1$-mixingale (Andrews [1988]).

We  give  the  regularity  assumptions  in  two  sets,  one  set  of  assumptions  on  the  process  itself,  and  the 
other  on  a  set  of  weights. 

Assumption 3.1: Suppose $\{X_t, t \ge 1\}$ on $(\Omega, \mathcal{F}, \Pr)$ satisfies $E X_t = 0$ for all $t$, and assume further that
for some $r > 1$: (i.) $\sup_t \|X_t\|_{4r} < \infty$; and (ii.) either (a.) $X_t$ is $\alpha$-mixing of size $-2r/(r-1)$ or (b.) $X_t$ is
$\phi^+$-mixing of size $-2$. ∎

For convenience in notation, let $X_t = 0$ for all $t \le 0$.

Assumption 3.2: Suppose $w_n(j)$, with $n \ge 1$, $j \ge 0$, is a double array of uniformly bounded non-negative
weights such that as $n \to \infty$, we have $w_n(j) \to 1$ for each $j$. ∎
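
A minimal sketch of one weight array that fits Assumption 3.2: the Bartlett weights $w_n(j) = 1 - j/(l(n)+1)$ used by Newey and West [1987]. The function name `bartlett_weight` is hypothetical, and the convergence $w_n(j) \to 1$ for fixed $j$ relies on the added assumption that the lag length grows without bound.

```python
def bartlett_weight(j: int, lag: int) -> float:
    """Bartlett weight 1 - j/(lag + 1) for 0 <= j <= lag, and 0 beyond the truncation lag."""
    if j < 0 or lag < 0:
        raise ValueError("j and lag must be non-negative")
    if j > lag:
        return 0.0
    return 1.0 - j / (lag + 1)


# For a fixed j, the weight approaches 1 as the lag length grows.
weights_at_j3 = [bartlett_weight(3, lag) for lag in (5, 50, 500, 5000)]
print(weights_at_j3)
```

The weights lie in $[0, 1]$ for every $j$ and every lag length, so they are uniformly bounded and non-negative as the assumption requires.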

These  assumptions  are  essentially  those  in  Newey  and  West  [1987]  Theorem  2,  or  White  [1984]  Theorem  6.20 
where  applicable. 

The  first  result  is  a  WLLN  for  dependent  double  arrays  that  will  be  used  below. 

Theorem 3.3: Assume (3.1) and (3.2), and define the double array of rv's:

$$Z_{nt} \equiv \sum_{j=0}^{l(n)} w_n(j) \left( X_t X_{t-j} - E X_t X_{t-j} \right)$$

for some sequence of nonnegative integers $l(n)$, with $l(n) = o(n^{1/3})$. Then the double array $\{Z_{nt}\}$ satisfies
$n^{-1} \sum_{t=1}^{n} Z_{nt} \overset{p}{\to} 0$ as $n \to \infty$. ∎
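
The following is a small simulation sketch of the quantity in Theorem 3.3, under assumptions that are much stronger than (3.1): the data are iid standard normal, so that $E X_t X_{t-j}$ equals 1 for $j = 0$ and 0 otherwise. The helper names (`z_array_mean`, `bartlett`, `iid_moment`), the lag rule $l(n) = \lfloor n^{0.3} \rfloor$, and the seed are illustrative choices, and a single draw only illustrates the statement without verifying it.

```python
import random


def z_array_mean(x, lag, weight, cross_moment):
    """Return n^{-1} * sum_t Z_nt, where
    Z_nt = sum_{j=0}^{lag} w(j) * (X_t X_{t-j} - E X_t X_{t-j}).

    x holds X_1..X_n (x[0] is X_1); X_t is treated as 0 for t <= 0, so terms
    that reach before the start of the sample are exactly zero and are skipped.
    cross_moment(j) must supply E X_t X_{t-j} for the simulated process.
    """
    n = len(x)
    total = 0.0
    for t in range(n):
        for j in range(min(lag, t) + 1):
            total += weight(j, lag) * (x[t] * x[t - j] - cross_moment(j))
    return total / n


rng = random.Random(0)
n = 20000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
lag = int(n ** 0.3)  # n^0.3 grows more slowly than n^(1/3)

bartlett = lambda j, m: 1.0 - j / (m + 1)
iid_moment = lambda j: 1.0 if j == 0 else 0.0

mean_z = z_array_mean(x, lag, bartlett, iid_moment)
print(lag, mean_z)
```

For dependent data the centering terms $E X_t X_{t-j}$ are unknown in practice; the simulation sidesteps this only because the cross moments of the simulated process are known by construction.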

Remarks 

1. Notice that if $l(n) \uparrow \infty$, $Z_{nt}$ has stronger long term dependence than does a mixingale. Further, $Z_{nt}$
will have growing moments for $l(n)$ increasing with $n$.

2.  Clearly  the  conclusion  remains  true  if  l(n)  is  fixed,  or  if  Znt  is  defined  to  exclude  the  j  =  0  term. 

3.  Our  improved  rate  derives  from  using  this  WLLN  below  in  place  of  the  implication  rule  as  in  White's 
[1984]  proof  of  his  Theorem  6.20. 
4. By first giving this WLLN, it should be clear that our proof differs from that in White [1984] Chapter
6,  by  a  change  in  the  order  of  summation. 

The principal result is convergence in probability for a weighted estimator of $\mathrm{Var}(n^{-1/2} \sum_{t=1}^{n} X_t)$. We
state this as follows:

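Theorem 3.4: Assume (3.1) and (3.2), and let $l(n)$ be a sequence of positive integers such that as $n \to \infty$,
$l(n) \uparrow \infty$, and $l(n) = o(n^{1/3})$. Then as $n \to \infty$,

$$n^{-1} \left[ \sum_{t=1}^{n} X_t^2 + 2 \sum_{j=1}^{l(n)} w_n(j) \sum_{t=j+1}^{n} X_t X_{t-j} \right] - \mathrm{Var}\left( n^{-1/2} \sum_{t=1}^{n} X_t \right) \overset{p}{\to} 0 .$$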
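
The estimator in Theorem 3.4 can be sketched directly from its formula. This is a minimal illustration, assuming mean-zero data as in Assumption 3.1; the function name `lag_window_variance` and the choice of Bartlett weights are illustrative, not something the theorem prescribes.

```python
def lag_window_variance(x, lag, weight):
    """n^{-1} [ sum_t X_t^2 + 2 sum_{j=1}^{lag} w(j) sum_{t=j+1}^{n} X_t X_{t-j} ].

    x holds X_1..X_n and is assumed to have mean zero (no demeaning is done).
    weight(j, lag) returns the weight w_n(j).
    """
    n = len(x)
    total = sum(v * v for v in x)
    for j in range(1, min(lag, n - 1) + 1):
        cross = sum(x[t] * x[t - j] for t in range(j, n))
        total += 2.0 * weight(j, lag) * cross
    return total / n


bartlett = lambda j, m: 1.0 - j / (m + 1)

# Tiny hand-checkable case: sum of squares is 4, the lag-1 cross sum is -3,
# and the Bartlett weight at j = 1 with lag 1 is 0.5, so the result is 0.25.
print(lag_window_variance([1.0, -1.0, 1.0, -1.0], 1, bartlett))
```

With `lag = 0` the function reduces to the sample second moment, which ignores all serial correlation.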


Remarks 

1. Apply Theorem 3.4 to the proof of Theorem 2 in Newey and West [1987] to argue convergence of the
second and third terms in their expression (9). The other terms similarly converge to zero by $l(n) \uparrow \infty$
and $l(n) = o(n^{1/3})$. The result here therefore implies the same conclusion as their Theorem 2, but with
greater flexibility in choice of lag length ($o(n^{1/3})$ instead of $o(n^{1/4})$).

2. Similarly, the results in Chapter 6 of Gallant and White [1988] remain intact with their Assumption TL
changed to $m_n = o(n^{1/3})$ from the typographically incorrect $O(n^{1/4})$ on p. 101.

3.  The  Newey-West  result  is  widely  used  in  applications:  see  for  example  Phillips  [1987]  Theorem  4.2, 
Phillips  and  Perron  [1987],  and  Phillips  and  Ouliaris  [1988],  and  elsewhere.  Those  results  therefore  all 
hold  with  an  even  more  flexible  choice  for  the  lag  length. 

4. The rate $o(n^{1/3})$ is also that used in autoregressive spectral density estimation under assumptions on
X  that  include  strict  stationarity,  absolute  summability  of  the  Wold  moving  average  coefficients,  and 
finite  fourth  moments  on  the  iid  innovations  (e.g.  Berk  [1974]  Theorem  1). 

5. Fuller [1976, Theorem 7.2.3] and Anderson [1971, Chapter 9] imply that an $o(n)$ rate can be used for the
strictly  stationary  case.  It  may  in  fact  be  possible  to  adapt  the  "unraveling"  method  used  there  for  the 
nonstationary  mixing  situation  considered  here. 


4.   Proofs 

In the sequel, the symbols $K$ and $K'$ will denote arbitrary finite constants, not necessarily the same
throughout.

Our first result is a WLLN. It is convenient to give the proof in two parts; the first part is a variance
bound which may be useful in other applications. This bound is also the part of the results that gives the
binding restriction on the mixing and moment conditions (3.1); thus if further improvement is forthcoming,
it is likely to come from a sharper inequality here.

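Proof of Theorem 3.3: First bound $\mathrm{Var}(\sum_{t=1}^{n} Z_{nt})$. Write
$\mathrm{Var}(\sum_{t=1}^{n} Z_{nt}) = |\sum_{t=1}^{n} \sum_{s=1}^{n} E Z_{nt} Z_{ns}|$.
Decompose this double sum of products into products close together, and products far apart. By the
triangle inequality,

$$\mathrm{Var}\left( \sum_{t=1}^{n} Z_{nt} \right) \le \sum_{t} \sum_{|s-t| \le 2l(n)} |E Z_{nt} Z_{ns}| + \sum_{t} \sum_{|s-t| > 2l(n)} |E Z_{nt} Z_{ns}| .$$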

Consider the first summand. For $s, t$ between 1 and $n$, the Cauchy-Schwarz inequality implies:

$$|E Z_{nt} Z_{ns}| \le \|Z_{nt}\|_2 \cdot \|Z_{ns}\|_2 \le \sup_{1 \le t \le n} \|Z_{nt}\|_2^2 .$$

For each $t$, there are at most $4l(n) + 1$ points $s$ for which $|s - t| \le 2l(n)$. Thus the first summand satisfies:

$$\sum_{t} \sum_{|s-t| \le 2l(n)} |E Z_{nt} Z_{ns}| \le n \cdot (4l(n) + 1) \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_2^2 .$$
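
The counting step above can be checked by brute force. The sketch below, with the hypothetical helper `near_count` and arbitrary illustrative values of `n` and `lag`, counts the indices $s \in \{1, \dots, n\}$ within distance $2l(n)$ of each $t$ and compares the total with the bound $n \cdot (4l(n) + 1)$.

```python
def near_count(t: int, n: int, lag: int) -> int:
    """Number of s in {1, ..., n} with |s - t| <= 2 * lag."""
    lo = max(1, t - 2 * lag)
    hi = min(n, t + 2 * lag)
    return hi - lo + 1


n, lag = 200, 7
counts = [near_count(t, n, lag) for t in range(1, n + 1)]
total_near_pairs = sum(counts)

# The largest window has 4*lag + 1 points; windows near the ends are shorter.
print(max(counts), total_near_pairs, n * (4 * lag + 1))
```

The bound is attained only for interior $t$, so the total number of close pairs is strictly below $n \cdot (4l(n)+1)$ whenever $l(n) \ge 1$.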

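Next consider the second summand. For $|s - t| > 2l(n)$, Davydov's Inequality implies:

$$|E Z_{nt} Z_{ns}| \le 15\, \alpha_{l(n)}^{1 - 1/r} \cdot \|Z_{nt}\|_{2r} \cdot \|Z_{ns}\|_{2r} \le 15\, \alpha_{l(n)}^{1 - 1/r} \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_{2r}^2 ,$$

while Peligrad's Inequality implies:

$$|E Z_{nt} Z_{ns}| \le 2\, \phi_{l(n)}^+ \cdot \|Z_{nt}\|_2 \cdot \|Z_{ns}\|_2 \le 2\, \phi_{l(n)}^+ \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_2^2 .$$

There are at most $n - 4l(n) - 1$ points $s$ such that $|s - t| > 2l(n)$. Thus the second summand obeys:

$$\sum_{t} \sum_{|s-t| > 2l(n)} |E Z_{nt} Z_{ns}| \le n (n - 4l(n) - 1) \cdot 15\, \alpha_{l(n)}^{1 - 1/r} \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_{2r}^2 \le n^2 \cdot 15\, \alpha_{l(n)}^{1 - 1/r} \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_{2r}^2 .$$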


Similarly, the second summand also satisfies:

$$\sum_{t} \sum_{|s-t| > 2l(n)} |E Z_{nt} Z_{ns}| \le n^2 \cdot 2\, \phi_{l(n)}^+ \cdot \sup_{1 \le t \le n} \|Z_{nt}\|_2^2 .$$


For any $p$ such that $2 \le p \le 2r$, Minkowski's inequality gives $\|Z_{nt}\|_p \le \sum_{j=0}^{l(n)} w_n(j) \|X_t X_{t-j} - E X_t X_{t-j}\|_p$.
By 3.1.i, $\sup_t \sup_j \|X_t X_{t-j} - E X_t X_{t-j}\|_p < \infty$. Further, since $w_n(j)$ is uniformly bounded, we have that
for some finite constant $K$, $\|Z_{nt}\|_p \le K \cdot (l(n) + 1)$, and hence $\|Z_{nt}\|_p^2 \le K^2 \cdot (l(n) + 1)^2$. Using this for $p = 2$ and $2r$
in the first and second summands, conclude that there must exist some finite constant $K'$ such that:

$$\mathrm{Var}\left( \sum_{t=1}^{n} Z_{nt} \right) \le K' \left\{ n \cdot (4l(n) + 1) \cdot (l(n) + 1)^2 + n^2 (l(n) + 1)^2 \cdot \alpha_{l(n)}^{1 - 1/r} \right\}$$

and

$$\mathrm{Var}\left( \sum_{t=1}^{n} Z_{nt} \right) \le K' \left\{ n \cdot (4l(n) + 1) \cdot (l(n) + 1)^2 + n^2 (l(n) + 1)^2 \cdot \phi_{l(n)}^+ \right\} .$$

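If 3.1.ii.a, then $\alpha_j = O(j^{\lambda})$ for some $\lambda < -2r/(r-1)$, so that $\alpha_{l(n)}^{1 - 1/r} = O(l(n)^{\lambda'})$ for some $\lambda' < -2$, or
$l(n)^2 \alpha_{l(n)}^{1 - 1/r} = o(1)$. Similarly, if 3.1.ii.b, then $l(n)^2 \phi_{l(n)}^+ = o(1)$. But then, using Chebyshev's inequality, for
any $\epsilon > 0$,

$$\Pr\left[ n^{-1} \left| \sum_{t=1}^{n} Z_{nt} \right| \ge \epsilon \right] \le \frac{K'}{\epsilon^2} \left\{ n^{-1} \cdot (4l(n) + 1) \cdot (l(n) + 1)^2 + (l(n) + 1)^2 \cdot \alpha_{l(n)}^{1 - 1/r} \right\}$$

and

$$\Pr\left[ n^{-1} \left| \sum_{t=1}^{n} Z_{nt} \right| \ge \epsilon \right] \le \frac{K'}{\epsilon^2} \left\{ n^{-1} \cdot (4l(n) + 1) \cdot (l(n) + 1)^2 + (l(n) + 1)^2 \cdot \phi_{l(n)}^+ \right\} .$$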


Given $l(n) = o(n^{1/3})$, and 3.1.ii, one or the other of the right hand sides above tends to zero as $n \to \infty$.
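
To make the role of the $n^{1/3}$ threshold concrete, here is a worked rate calculation under an illustrative assumption that is not in the original: $l(n) = c\, n^{a}$ for constants $c > 0$ and $0 < a < 1/3$, together with the mixing condition 3.1.ii.a.

```latex
\begin{align*}
n^{-1}\,(4l(n)+1)\,(l(n)+1)^2
  &= O\!\left(n^{-1}\, l(n)^3\right) = O\!\left(n^{3a-1}\right) \to 0
  \quad\text{since } 3a - 1 < 0, \\
\lambda\left(1 - \tfrac{1}{r}\right)
  &= \lambda\,\tfrac{r-1}{r} < -\tfrac{2r}{r-1}\cdot\tfrac{r-1}{r} = -2
  \quad\text{for } \lambda < -\tfrac{2r}{r-1}, \\
(l(n)+1)^2\,\alpha_{l(n)}^{\,1-1/r}
  &= O\!\left(l(n)^{\,2+\lambda(1-1/r)}\right) \to 0
  \quad\text{as } l(n) \to \infty .
\end{align*}
```

At the boundary $a = 1/3$ the first term tends to $4c^3$ rather than zero, so this particular variance bound cannot deliver a faster lag growth rate.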


Therefore, as $n \to \infty$, $n^{-1} \sum_{t=1}^{n} Z_{nt} \overset{p}{\to} 0$.

Q.E.D.


Next,  turn  to  the  main  result  (3.4).  The  proof  of  this  is  an  abbreviation  and  modification  of  ideas  in 
Newey  and  West  [1987]  and  White  [1984]  Chapter  6.  The  crucial  difference  is  in  replacing  White's  (corrected) 
Lemma  6.19,  the  implication  rule,  and  an  early  use  of  Chebyshev's  Inequality  with  our  WLLN  for  double 
arrays  (3.3).  With  this  WLLN,  (3.4)  follows  in  a  remarkably  straightforward  way. 

Proof of Theorem 3.4: Proceed in two steps. First, argue the expected version with truncation and
weighting differs from $\mathrm{Var}(n^{-1/2} \sum_{t=1}^{n} X_t)$ by a quantity that vanishes as $n \to \infty$. Then show the feasible
estimator converges in probability to its expectation. Begin with the expected version:

$$n^{-1} \left[ \sum_{t=1}^{n} E X_t^2 + 2 \sum_{j=1}^{l(n)} w_n(j) \sum_{t=j+1}^{n} E X_t X_{t-j} \right] - \mathrm{Var}\left( n^{-1/2} \sum_{t=1}^{n} X_t \right)$$

$$= \frac{2}{n} \sum_{j=1}^{l(n)} (w_n(j) - 1) \sum_{t=j+1}^{n} E X_t X_{t-j} - \frac{2}{n} \sum_{j=l(n)+1}^{n-1} \sum_{t=j+1}^{n} E X_t X_{t-j} .$$
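
The decomposition above can be checked numerically in a special case. The sketch below assumes a covariance stationary AR(1) process with unit innovation variance, so that $\sum_{t=j+1}^{n} E X_t X_{t-j} = (n - j)\gamma(j)$ with $\gamma(j) = \rho^j / (1 - \rho^2)$; stationarity, the Bartlett weights, the lag rule $\lfloor n^{0.3} \rfloor$, and all function names are illustrative assumptions rather than part of the proof.

```python
def expected_version_gap(gamma, n, lag, weight):
    """Expected weighted estimator minus Var(n^{-1/2} sum_t X_t), computed two ways
    for a covariance stationary process with autocovariance function gamma(j).

    Requires 0 <= lag <= n - 1. Returns (direct difference, decomposed difference).
    """
    # Direct difference of the two quantities.
    expected_est = (n * gamma(0)
                    + 2.0 * sum(weight(j, lag) * (n - j) * gamma(j)
                                for j in range(1, lag + 1))) / n
    true_var = (n * gamma(0)
                + 2.0 * sum((n - j) * gamma(j) for j in range(1, n))) / n
    direct = expected_est - true_var

    # Decomposition into a weighting term and a truncation term.
    weighting = (2.0 / n) * sum((weight(j, lag) - 1.0) * (n - j) * gamma(j)
                                for j in range(1, lag + 1))
    truncation = (2.0 / n) * sum((n - j) * gamma(j) for j in range(lag + 1, n))
    return direct, weighting - truncation


rho = 0.5
ar1_gamma = lambda j: rho ** j / (1.0 - rho ** 2)
bartlett = lambda j, m: 1.0 - j / (m + 1)

gap_small, check_small = expected_version_gap(ar1_gamma, 100, int(100 ** 0.3), bartlett)
gap_large, check_large = expected_version_gap(ar1_gamma, 10000, int(10000 ** 0.3), bartlett)
print(gap_small, gap_large)
```

In this example the gap shrinks in absolute value as $n$ and the lag length grow, which is the behavior the first step of the proof establishes in general.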

By Davydov's Inequality,

$$|E X_t X_{t-j}| \le 15\, \alpha_j^{1 - 1/(2r)} \|X_t\|_{4r} \cdot \|X_{t-j}\|_{4r} ,$$

and by Peligrad's Inequality,

$$|E X_t X_{t-j}| \le 2\, \phi_j^+ \|X_t\|_2 \cdot \|X_{t-j}\|_2 .$$


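By 3.1, $\sup_t \|X_t\|_{4r} < \infty$ and $\sup_t \|X_t\|_2 < \infty$, so that:

$$\left| n^{-1} \sum_{j=1}^{l(n)} (w_n(j) - 1) \sum_{t=j+1}^{n} E X_t X_{t-j} \right| \le K \sum_{j=1}^{l(n)} |w_n(j) - 1|\, \alpha_j^{1 - 1/(2r)} ,$$

and similarly,

$$\left| n^{-1} \sum_{j=1}^{l(n)} (w_n(j) - 1) \sum_{t=j+1}^{n} E X_t X_{t-j} \right| \le K \sum_{j=1}^{l(n)} |w_n(j) - 1|\, \phi_j^+ .$$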


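From 3.1.ii, we have that either $\sum_{j=1}^{\infty} \alpha_j^{1/2 - 1/(2r)} < \infty$ or $\sum_{j=1}^{\infty} (\phi_j^+)^{1/2} < \infty$, which imply either
$\sum_{j=1}^{\infty} \alpha_j^{1 - 1/(2r)} < \infty$ or $\sum_{j=1}^{\infty} \phi_j^+ < \infty$, respectively. Since $w_n(j) \to 1$ for each $j$, the dominated
convergence theorem then implies that $\sum_{j=1}^{l(n)} |w_n(j) - 1|\, \alpha_j^{1 - 1/(2r)} \to 0$ or $\sum_{j=1}^{l(n)} |w_n(j) - 1|\, \phi_j^+ \to 0$ as $n \to \infty$.
By a similar argument, we have that: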


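$$\left| n^{-1} \sum_{j=l(n)+1}^{n-1} \sum_{t=j+1}^{n} E X_t X_{t-j} \right| \le K \sum_{j=l(n)+1}^{n-1} \left( \frac{n-j}{n} \right) \alpha_j^{1 - 1/(2r)} \le K \sum_{j=l(n)+1}^{n-1} \alpha_j^{1 - 1/(2r)}$$

and

$$\left| n^{-1} \sum_{j=l(n)+1}^{n-1} \sum_{t=j+1}^{n} E X_t X_{t-j} \right| \le K \sum_{j=l(n)+1}^{n-1} \phi_j^+ .$$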


Again, if 3.1.ii.a, $\sum_{j=1}^{\infty} \alpha_j^{1 - 1/(2r)}$ converges to a finite quantity. Similarly, if 3.1.ii.b, $\sum_{j=1}^{\infty} \phi_j^+$ converges to a
finite quantity. In either case, this implies the respective right hand side above converges to 0, provided that
$l(n) \uparrow \infty$ as $n \to \infty$. This completes the first part of the proof.

Second, we show the required convergence in probability. The estimator

$$n^{-1} \left[ \sum_{t=1}^{n} X_t^2 + 2 \sum_{j=1}^{l(n)} w_n(j) \sum_{t=j+1}^{n} X_t X_{t-j} \right]$$

differs from its expectation by:

$$n^{-1} \left[ \sum_{t=1}^{n} (X_t^2 - E X_t^2) + 2 \sum_{j=1}^{l(n)} w_n(j) \sum_{t=j+1}^{n} (X_t X_{t-j} - E X_t X_{t-j}) \right] .$$

For ease of notation, define $X_t = 0$ for all $t \le 0$, and rearrange orders of summation in the second term:

$$2 n^{-1} \sum_{j=1}^{l(n)} w_n(j) \sum_{t=j+1}^{n} (X_t X_{t-j} - E X_t X_{t-j}) = 2 n^{-1} \sum_{t=1}^{n} \sum_{j=1}^{l(n)} w_n(j) (X_t X_{t-j} - E X_t X_{t-j}) .$$
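
The interchange of summation relies only on the zero-padding convention for indices before the start of the sample. A minimal numeric check, using hypothetical helper names and arbitrary illustrative numbers, and omitting the centering terms (which are reindexed in exactly the same way):

```python
def sum_by_lag(x, lag, w):
    """sum_{j=1}^{lag} w[j] * sum_{t=j+1}^{n} x_t x_{t-j}  (outer sum over lags)."""
    n = len(x)
    return sum(w[j] * sum(x[t] * x[t - j] for t in range(j, n))
               for j in range(1, lag + 1))


def sum_by_time(x, lag, w):
    """sum_{t=1}^{n} sum_{j=1}^{lag} w[j] * x_t x_{t-j}, with x_t = 0 before the sample."""
    n = len(x)
    padded = [0.0] * lag + list(x)  # padded[lag + i] is x[i]; the first `lag` entries are zeros
    return sum(w[j] * padded[lag + t] * padded[lag + t - j]
               for t in range(n) for j in range(1, lag + 1))


x = [0.5, -1.2, 2.0, 0.3, -0.7, 1.1]
w = [1.0, 0.75, 0.5, 0.25]  # w[0] is unused because the sums start at j = 1
print(sum_by_lag(x, 3, w), sum_by_time(x, 3, w))
```

The two orderings agree because every term added by extending the inner sum down to $t = 1$ contains a zero factor.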

Define the double array of rv's $Z_{nt} = \sum_{j=1}^{l(n)} w_n(j) (X_t X_{t-j} - E X_t X_{t-j})$, and apply Theorem 3.3 to it. The
term $2 n^{-1} \sum_{j=1}^{l(n)} w_n(j) \sum_{t=j+1}^{n} (X_t X_{t-j} - E X_t X_{t-j})$ therefore converges in probability to zero. Similarly
apply Theorem 3.3 (with $l(n) = 0$) to $X_t^2 - E X_t^2$, so that the first term $n^{-1} \sum_{t=1}^{n} (X_t^2 - E X_t^2)$ converges in
probability to zero as well. This completes the proof. Q.E.D.

Notice  that  the  first  part  of  the  proof  uses  considerably  weaker  mixing-moment  conditions  than  necessary 
for  the  second  part  of  the  proof,  which  is  essentially  a  repeated  application  of  Theorem  3.3.  Thus  an 
improvement  in  the  lag  growth  rate,  if  forthcoming,  might  most  easily  be  found  by  obtaining  a  sharper 
inequality  than  that  available  in  the  proof  of  Theorem  3.3. 